
Alice Oh

KAIST

Not What, But How: A Communicative Audit of LLM Response Framing

Jun 01, 2026
Via arXiv

JuICE: A Benchmark for Evaluating LLM-Judge in Identifying Cultural Errors

May 26, 2026
Via arXiv

LoCar: Localization-Aware Evaluation of In-Vehicle Assistants through Fine-Grained Sociolinguistic Control

May 20, 2026
Via arXiv

On the limits and opportunities of AI reviewers: Reviewing the reviews of Nature-family papers with 45 expert scientists

May 20, 2026
Via arXiv

SemEval-2026 Task 7: Everyday Knowledge Across Diverse Languages and Cultures

May 04, 2026
Via arXiv

Investigating Counterfactual Unfairness in LLMs towards Identities through Humor

Apr 20, 2026
Via arXiv

FINEST: Improving LLM Responses to Sensitive Topics Through Fine-Grained Evaluation

Mar 04, 2026
Via arXiv

A Neuropsychologically Grounded Evaluation of LLM Cognitive Abilities

Mar 03, 2026
Via arXiv

MentalBench: A Benchmark for Evaluating Psychiatric Diagnostic Capability of Large Language Models

Feb 13, 2026
Via arXiv

Cultural Compass: A Framework for Organizing Societal Norms to Detect Violations in Human-AI Conversations

Jan 12, 2026
Via arXiv